Elsevier

Environment International

Volume 80, July 2015, Pages 8-18

Review
Beyond QMRA: Modelling microbial health risk as a complex system using Bayesian networks

https://doi.org/10.1016/j.envint.2015.03.013

Highlights

  • Bayesian networks are emerging as a valuable technique in microbial risk assessment.
  • Fifteen applications in food and water health risk assessment are reviewed.
  • BNs are flexible and have diverse applications.
  • The method overcomes problems where data are sparse, missing or poor in quality.
  • BNs offer significant untapped potential in quantifying microbial exposures.

Abstract

Background

Quantitative microbial risk assessment (QMRA) is the current method of choice for determining the risk to human health from exposure to microorganisms of concern. However, current approaches are often constrained by the availability of required data, and may not be able to incorporate the many varied factors that influence this risk. Systems models, based on Bayesian networks (BNs), are emerging as an effective complementary approach that overcomes these limitations.

Objectives

This article aims to provide a comparative evaluation of the capabilities and challenges of current QMRA methods and BN models, and a scoping review of recent published articles that adopt the latter for microbial risk assessment. Pros and cons of systems approaches in this context are distilled and discussed.

Methods

A search of the peer-reviewed literature revealed 15 articles describing BNs used in the context of QMRAs for foodborne and waterborne pathogens. These studies were analysed in terms of their application, uses and benefits in QMRA.

Discussion

The applications were notable in their diversity. BNs were used to make predictions, for scenario assessment, risk minimisation, to reduce uncertainty and to separate uncertainty and variability. Most studies focused on a segment of the exposure pathway, indicating the broad potential for the method in other QMRA steps. BNs offer a number of useful features to enhance QMRA, including transparency, and the ability to deal with poor quality data and support causal reasoning.

Conclusion

The method has significant untapped potential to describe the complex relationships between microbial environmental exposures and health.

Abbreviations

BN
Bayesian network
CFU
colony-forming unit
DAG
Directed Acyclic Graph
FIB
faecal indicator bacteria
MC
Monte Carlo
MCMC
Markov chain Monte Carlo
MPN
most probable number
MPRM
modular process risk model
QMRA
quantitative microbial risk assessment

Keywords

Bayesian network
Health risk assessment
Microbial risk
Modelling
QMRA
Uncertainty

1. Introduction

Quantitative microbial risk assessment (QMRA) is an established framework for assessing public health risks from pathogenic organisms (Haas et al., 1999). As a relatively recent addition to the risk analysis field (Havelaar et al., 2008), the QMRA methodology is evolving, and current challenges suggest scope exists to augment established methods to improve its capabilities. Bayesian networks (BNs) are emerging as an attractive way of modelling ‘wicked’ problems, particularly in complex environmental systems (Aguilera et al., 2011, Barton et al., 2012, Uusitalo, 2007). In essence, a BN is a flexible graphical model that incorporates dependencies among its variables via probabilistic relationships. The method allows the integration of a range of quantitative information, which is particularly useful in environmental domains such as QMRA, where traditional experimental and observational data are missing, inaccurate, sparse or costly (Aguilera et al., 2011). The use of BNs is increasing exponentially across a wide range of application domains (Aguilera et al., 2011, Barton et al., 2012, Korb and Nicholson, 2011). Despite this increased interest and a growing body of literature, there has not yet been a critical review and evaluation of the approach in the context of QMRA across domains that draws together the literature and can be used to educate and guide practitioners. This paper aims to fill that gap by exploring the range of applications of BNs in the microbial risk assessment domain.
We begin with the background to current challenges in QMRA and a more detailed description of BN models and their features. We then examine published examples of the use of BNs in QMRA and discuss the applications of the method to assess and describe human health risk in different exposure domains. Finally, we discuss advantages and limitations of the approach in the context of QMRA and draw attention to gaps in the reporting of current research in the area.

2. Background

2.1. Quantitative microbial risk assessment

QMRA is a structured approach which brings information and data together with mathematical models to examine the exposure and spread of microbial agents and to characterize the nature of the adverse outcomes (United States Environmental Protection Authority and United States Department of Agriculture/Food Safety and Inspection Service, 2012). The six steps of a QMRA, illustrated in Fig. 1, are hazard characterization, exposure assessment, dose–response assessment, risk characterization, risk management and risk communication (National Research Council, 2009). A typical example of the output from a QMRA model is the probability of infection or illness associated with ingestion of food or water containing pathogens. This may then be used to predict the number of cases of illness caused by pathogens introduced into a food production chain, or the incidence of waterborne disease in a population of interest (Smid et al., 2011).

Fig. 1. The six steps of a quantitative microbial risk assessment.

Despite their widespread use, established QMRA methods have some inherent constraints. For a feasible QMRA, the minimum amount of data needed to calculate microbiological risk is the infective dose, calculated from the pathogen concentration in the exposure medium and the quantity of medium implicated in the exposure (Soller et al., 2004, Thoeye et al., 2003). A significant limitation in QMRA, however, is the lack of appropriate data to quantify established models in the dose–response step. The availability of pathogen-specific dose–response models is limited, due in part to the cost and ethical dilemmas of human feeding experiments. Where it does exist, dose–response information has limitations, which include the use of healthy volunteers and/or attenuated organisms in experiments and the omission of low doses from experiments (Rose et al., 2008), the significant effect of differential susceptibility in the host (Soller, 2008) and the definition of response, which may vary from faecal excretion to antibody response and sometimes symptomatic illness (Rose et al., 2008).
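As a minimal sketch of the calculation described above, the dose can be taken as the product of pathogen concentration and the quantity of medium ingested, and then passed to a dose–response model. The exponential model used here is one commonly used form and is chosen only for illustration; the function names and all numerical values are hypothetical and are not drawn from the reviewed studies.

```python
import math

def ingested_dose(concentration_per_litre: float, volume_litres: float) -> float:
    """Mean dose (organisms) = pathogen concentration x quantity of medium ingested."""
    return concentration_per_litre * volume_litres

def p_infection_exponential(dose: float, r: float) -> float:
    """Exponential dose-response model: P(infection) = 1 - exp(-r * dose).

    r is the per-organism probability of initiating infection (assumed value).
    """
    return 1.0 - math.exp(-r * dose)

# Hypothetical illustrative inputs, not taken from any study.
dose = ingested_dose(concentration_per_litre=10.0, volume_litres=0.1)
risk = p_infection_exponential(dose, r=0.005)
```

With these assumed inputs the dose is one organism per exposure and the risk is slightly below the value of r, as expected for small doses under the exponential model.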
A second key issue arises in measuring the many facets of microbial exposure. For example, a quantitative assessment of the overall health risk of exposure to a pathogen-containing medium such as water, air or soil, taking into account all potential exposure routes, may be infeasible, as the data relating to exposure routes such as inhalation or dermal contact are scarce or not available. Furthermore, in a scenario where exposures occur via multiple routes simultaneously, it is difficult or even impossible to determine which exposure route is responsible for what proportion of the risk. As another case in point, the apparently straightforward process of the detection and quantification of microscopic pathogens is subject to multifarious influences (Crainiceanu et al., 2003). For example, due to the low numbers (tens or hundreds) of microorganisms entailed in exposure from water, there may be large differences with respect to the actual number of organisms ingested between individuals (Haas, 2002), impacting model assumptions. Other sources of uncertainty and variability in enumeration data include sample representativeness, recovery efficiencies, detection limits, microbial kinetics such as resistance, die-off and growth, and differentiation between strains (Petterson et al., 2006, Smid et al., 2011). Although the QMRA concept and framework are widely recognized, the limitations in available data and models are acknowledged by authorities (World Health Organisation, 2008), and there is recognition of the scope for current approaches to be enhanced and complemented by alternative techniques (Havelaar, 2012, World Health Organisation, 2008).
A third key issue is that environmental systems that can be modelled using QMRA are characterized by significant levels of uncertainty and variability, due to complex ecology, population dynamics, and the multiple physicochemical and biotic influences at play. Uncertainty signifies the degree of accuracy and precision with which a quantity is measured, and can be characterized and reduced by altering the model and/or collecting more data. In contrast, variability, as a feature of natural systems, can also be characterized, but cannot be reduced (National Research Council, 2009). The precision and consequent usefulness of a numerical risk assessment rests on its ability to indicate, separate and evaluate the uncertainty and variability of the estimate (Lammerding, 1997, Vose, 2000). Thus, the separation and characterisation of the uncertainty and variability of model parameters is now widely recommended in risk assessment (Codex Alimentarius Commission, 1999, Food and Agriculture Organization of the United Nations and World Health Organization, 2003, Vose, 2000).
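One way to picture this separation is a two-dimensional (nested) Monte Carlo simulation, sketched below. This technique is not described in the text above and is offered only as an illustration: the outer loop samples an uncertain parameter, the inner loop samples variability between exposure events, and the spread of the outer-loop results expresses uncertainty about a variability summary. All distributions, parameter values and the function name are assumptions made for the sketch.

```python
import math
import random
import statistics

def two_dimensional_mc(n_uncertainty=200, n_variability=500, seed=1):
    """Return (median, lower, upper) of the mean risk per exposure event.

    lower and upper bound an approximate 95% uncertainty interval.
    """
    rng = random.Random(seed)
    mean_risks = []
    for _ in range(n_uncertainty):
        # Outer loop (uncertainty): the true mean log10 concentration is not
        # known exactly, so it is drawn from an assumed distribution.
        mu_log10 = rng.gauss(0.0, 0.3)
        risks = []
        for _ in range(n_variability):
            # Inner loop (variability): concentration and ingested volume
            # differ between exposure events even if mu_log10 were known.
            concentration = 10 ** rng.gauss(mu_log10, 0.5)
            volume = rng.uniform(0.05, 0.25)
            # Assumed exponential dose-response with hypothetical r = 0.005.
            risks.append(1.0 - math.exp(-0.005 * concentration * volume))
        mean_risks.append(statistics.fmean(risks))
    mean_risks.sort()
    lower = mean_risks[int(0.025 * n_uncertainty)]
    upper = mean_risks[int(0.975 * n_uncertainty) - 1]
    return statistics.median(mean_risks), lower, upper
```

Collecting more data would narrow the outer-loop distribution (uncertainty), whereas the inner-loop spread (variability) would remain.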
There is increasing interest in viewing the microbial risk pathway as a system, such as the modular process risk modelling process (MPRM) proposed by Nauta (2001) and Nauta et al. (2007), in which discrete processes or events in the risk pathway are represented as linked modules. A systems approach encompasses both holistic and modular views, enables the synthesis of knowledge of the parts in order to help understand the whole and makes a complex system more manageable (Auyang, 2004). A MPRM aims to model the transmission of micro-organisms along the food pathway by breaking down the pathway into consecutive modules and then modelling the basic microbial processes that take place in each module. For example, the dynamics of Salmonella in the ‘farm-to-fork’ pork slaughter chain can be described by the six basic MPRM processes of growth, inactivation, mixing, partitioning, removal and cross-contamination. At least one of these six basic processes is assigned to each key step in the chain, such as killing, scalding and dehairing, for modelling purposes (Smid et al., 2011).
BNs are one of a range of modelling tools that offer a systems perspective, with the added advantage of conditional dependence of the modules. Although other techniques such as MC simulation of standard QMRA models share commonalities with BNs, including the expression of parameter uncertainty using distribution functions, and visualisation as network graphs, a BN conveniently infers immediate changes in parameter values when new evidence is added (Greiner et al., 2013). The attraction of BNs includes the ability to address the three key issues outlined above: the scarcity of dose–response data and uncertainty in dose–response models, the difficulties with modelling exposure pathways due to complexity and lack of data, and the necessity, in order to produce an informative risk estimate, to characterize and separate uncertainty and variability. In addition, BNs offer a number of other features which are useful in QMRA, and which are described below.

2.2. Bayesian networks

A BN is a form of graphical model with variables represented by nodes, and connections between the variables represented by directed arcs (Jensen and Nielsen, 2007). Each node category or ‘state’ is assigned a probability distribution conditional on its parent nodes; these distributions can be derived from empirical data, statistical models, simulations, published papers or reports, or from expert opinion (Pollino et al., 2007). Arcs linking the nodes represent dependencies, with the strength of the causal links represented by these conditional probabilities. The directed arcs and the constraint that the arcs cannot form cycles or feedback loops within the model mean that the BN is part of a specific family of graphical models known as Directed Acyclic Graphs (DAGs). The majority of BNs are not time dependent, i.e., are ‘static’ in time, but nodes representing past events can be included, and the requirement to incorporate time as a variable can be met with object-oriented and dynamic BNs (Johnson et al., 2010, Johnson and Mengersen, 2011). An example of a simple BN indicating factors influencing microbial growth is shown in Fig. 2.

Fig. 2. Example of simple Bayesian network indicating causal factors for microbial growth.
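The conditional probability structure of a BN can be written compactly as a factorisation of the joint distribution over the parents of each node. The general form is standard; the three-node instance uses hypothetical nodes (temperature T, acidity A and growth G) chosen for illustration, which are not necessarily the nodes shown in Fig. 2.

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)
\qquad \text{e.g.} \qquad
P(T, A, G) = P(T)\, P(A)\, P(G \mid T, A)
```

Here pa(X_i) denotes the parent nodes of X_i in the directed acyclic graph; nodes without parents, such as T and A in this instance, carry unconditional distributions.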

By their construction, BNs are able to characterize and quantify a complex outcome, as well as describe the many possible interactions between variables associated with the outcome (Donald et al., 2009). BNs can be used for ‘forward inference’, by which the inputs are specified and the impact on the outcome is observed, and ‘backwards reasoning’, by which the outcome is specified and the states of the system's variables required to obtain that outcome are calculated. This ability to undertake ‘backwards reasoning’ is due to Bayes' theorem (Jensen and Nielsen, 2007, Pearl, 2000). BNs are thus able to reveal variables that are major drivers for an outcome, or conversely, the sensitivity of the outcome to variables in the network (Ben-Gal, 2007, Coupé et al., 2000, Pollino and Hart, 2005). Uncertainty is explicitly represented in a BN, as each node or variable is represented as a probability distribution. This is particularly important in environmental systems, where uncertainty can be widespread (Aguilera et al., 2011).
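The two directions of reasoning can be illustrated with a very small discrete network evaluated by direct enumeration. The sketch below assumes a hypothetical three-node structure (temperature and pH as parents of microbial growth) with made-up probabilities; it is not the model of any reviewed study, and practical BN software uses more efficient inference algorithms than enumeration.

```python
from itertools import product

# Hypothetical prior distributions for the two parent nodes.
P_TEMP = {"warm": 0.3, "cold": 0.7}
P_PH = {"neutral": 0.6, "acidic": 0.4}

# Hypothetical conditional probability table: P(growth = True | temperature, pH).
P_GROWTH = {
    ("warm", "neutral"): 0.9,
    ("warm", "acidic"): 0.4,
    ("cold", "neutral"): 0.2,
    ("cold", "acidic"): 0.05,
}

def joint(temp, ph, growth):
    """Joint probability from the factorisation P(T) * P(pH) * P(G | T, pH)."""
    p_g = P_GROWTH[(temp, ph)]
    return P_TEMP[temp] * P_PH[ph] * (p_g if growth else 1.0 - p_g)

def forward_growth_given_temp(temp):
    """Forward inference: P(growth = True | temperature), summing over pH."""
    return sum(P_PH[ph] * P_GROWTH[(temp, ph)] for ph in P_PH)

def backward_temp_given_growth(temp):
    """Backwards reasoning: P(temperature | growth = True) via Bayes' theorem."""
    numerator = sum(joint(temp, ph, True) for ph in P_PH)
    evidence = sum(joint(t, ph, True) for t, ph in product(P_TEMP, P_PH))
    return numerator / evidence
```

With these assumed numbers, specifying a warm temperature gives a growth probability of 0.70, while observing growth raises the probability that the temperature was warm from the prior of 0.30 to about 0.68.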
Another useful feature of BNs is the faculty for ‘structure learning’, or the automatic derivation of the graph structure, either whole or in part, directly from a data set. This ability for BN structure to be directly induced from data is well established and has considerably increased the potential applications of BNs (Aguilera et al., 2011). Uusitalo (2007) argues, however, that BN structures cannot be reliably estimated based on data as environmental systems include significant uncertainty and variability, and further, that postulated beliefs about causal connections generally produce better models.
BNs have been used in environmental modelling for some years (Varis and Kuikka, 1999), although the full potential of these models in this field is thought to be largely untapped (Aguilera et al., 2011). In particular, they are only just emerging as popular models for describing the complex relationships between environmental exposures and health. Düspohl et al. (2012) maintain ‘BNs have the potential to become a core method of transdisciplinary research and knowledge integration in environmental management’.

2.3. Bayesian networks in QMRA

BNs were proposed for QMRA over a decade ago (Barker, 2004). Their appearance in the field began and has predominated in food safety risk assessment, where they have been used to model one component of a QMRA or a whole food production chain (Greiner et al., 2013). A stochastic QMRA, consisting of a set of biological and/or process-related variables and the mathematical equations defining their dependencies may be considered as a BN (Rigaux et al., 2012a, Smid et al., 2010). In other applications of a BN, a QMRA may provide inputs into a BN, or a BN may be used to augment a QMRA (Donald et al., 2009). Greiner et al. (2013) affirm that an entire QMRA model can be formulated as a BN using the same mathematical equations as an MC model but implemented in a network which includes the joint distribution of all variables in the model. In general, Bayesian methods provide a flexible and powerful approach to QMRA and risk assessment modelling, with the caveat that implementation may be more challenging in practice than MC modelling (Greiner et al., 2013).
Applications of BNs in a QMRA have been described by previous authors in the food safety domain (Greiner et al., 2013, Parsons et al., 2005, Smid et al., 2010), but to the best of our knowledge, this is the first review of their use in the QMRA context across domains. Further elaboration of the advantages and drawbacks of BNs in QMRA occurs in the Discussion section.

3. Method

Published studies were identified in a search of the Web of Science, Scopus, PubMed, SpringerLink and Informit databases, focusing on the terms ‘Bayesian network’, ‘Bayesian belief network’, or ‘Bayesian graphical model’, used in conjunction with the terms ‘QMRA’, or ‘microb*’. Quantitative microbial risk assessments for foodborne or waterborne pathogens where BNs were used, with or without the use of Bayesian statistical methods, were included in this review. The literature describing applications of BNs in QMRA is not extensive, demonstrating the novelty of the approach in this particular domain. Fifteen papers were selected for inclusion based on their relevance to these search criteria. The studies were examined firstly to determine the study domain, aims and application of the method to QMRA. Closer scrutiny was undertaken to determine the knowledge source/s for the BN model structure, source of conditional probability table values, techniques used for validating the model and for belief updating (sometimes referred to as probabilistic inference). Finally, the identified functions of the BN (for example, prediction, separation of uncertainty and variability, scenario assessment and decision-making) and other gains made in the course of the analysis (e.g., software or new method development) were ascertained. This information has been summarized in Table A.1 in Appendix A.

4. Results

Of the 15 peer-reviewed journal papers examined, 10 were published within the last 5 years. Eleven articles pertained to microbial risk assessment of foodstuffs, and the remaining 4 related to waterborne microbial hazards. The number of nodes in the BNs varied widely, from 6 to 63.
In the following synopsis, each of these articles is now introduced and briefly discussed. The issues that they raise, in particular the strengths and drawbacks of the BN approach for QMRA, are then drawn together in a summary discussion.

4.1. Foodborne microbial risk assessments

The primary purpose of a risk assessment by Barker et al. (2002) was the translation of the QMRA to risk management decisions. A model of a nonspecific food manufacturing process was developed to represent two components of foodborne botulism (spore concentration and bacterial growth), in terms of contamination processes, spore thermal death kinetics, germination and growth of cells, toxin production and patterns of consumer behaviour. A BN was used to combine diverse information sources, such as operating experience, with low quality experimental data such as negative spore counts.
Microbial growth variability and uncertainty is a key source of microbial risk variability and uncertainty, which led Pouillot et al. (2003) to propose a method to estimate growth curve parameters of Listeria monocytogenes in milk, using published data. The primary aim of this BN application was to model separately and evaluate uncertainty and variability by means of hyperparameters, improving the growth model parameter estimation for risk assessment purposes.
In a QMRA quantified by a review of the scientific literature and industry practices, Parsons et al. (2005) used a BN to make estimates of the prevalence of Salmonella-positive birds in a flock, and inferences about system variables, in order to reduce the Salmonella contamination rate in the final product of a poultry production chain. The QMRA was subsequently used as a basis on which to compare three modelling approaches for quantitative risk assessment.
Delignette-Muller et al. (2006) modelled the effects of time and temperature on competing growth rates of L. monocytogenes and food flora, as part of a larger collaborative project assessing exposure to the pathogen in cold-smoked salmon. The BN in this case accounted for the main sources of variability and uncertainty in these predictive microbiology models, thereby increasing the accuracy and validity of the models for the QMRA.
Albert et al. (2008) estimated the probability of contracting campylobacteriosis as a result of broiler contamination in a food production chain using a BN. In this instance a core stochastic model based on current or prior knowledge was built using only expert opinions and scientific literature. After an initial validation, the model was augmented with relevant data where it was available. The model illustrated the power of the Bayesian approach, particularly against a background of scarce data, as it enabled the combination of data with other disparate sources of information.
Articles by Smid et al. (2011, 2012) describe the development and use of a BN to trace sources of contamination (biotracing) for individual Salmonella-positive pig carcasses in a slaughterhouse. The purpose of the model was to allow plant operators to prioritize decontamination measures. To achieve biotracing, a model must be able to answer questions in the reverse direction of the chain processing order, which requires the incorporation of multiple pieces of evidence to update the statistics of the model parameters. BNs allow for such inferential queries and are therefore an appropriate choice of model for biotracing, due to their ability to use downstream information to point to materials, processes, or actions within a particular food chain that can be identified as the contamination source. This model demonstrates the concept of biotracing, gives insight into the dynamics of Salmonella in the slaughter line and indicates where in the line data collection is most effective for biotracing.
In a QMRA undertaken by Rigaux et al. (2012a), genetic diversity and the variation in concentrations of Bacillus cereus with time and temperature in a processing chain for courgette puree were studied. The BN modelled batch-specific variability separately from uncertainty and enabled backward calculation to update the experts' knowledge about the microbial dynamics of the pathogen using experimental data. The results included improvement of prior beliefs about the dynamics of the foodborne pathogen and reduction in uncertainty.
Meta-analyses are increasingly being performed in QMRAs for food safety and quality, to estimate the inactivation or growth parameters of micro-organisms of concern, in order to generate sufficiently generic parameters, with their variability, which can be used in further quantitative risk assessments. Rigaux et al. (2012b) employed a BN to address a persistent problem in canned food processing, microbial spoilage by Geobacillus stearothermophilus. The BN was used to estimate the thermal inactivation parameters of the pathogen, using a meta-analysis of reference inactivation parameters for the organism, to take advantage of the large quantity of data in the scientific and grey literature.
Smid et al. (2013) used a BN to obtain an accurate estimate for the transfer ratio of bacteria from one surface to another during pork cutting in a processing chain, by incorporating uncertainty from one experiment and variability from multiple experiments into one model. Benefits included improved insight into biological parameters such as recovery ratios, pathogen count data and transfer ratios, and a correct representation of their uncertainty, producing better QMRA models. The researchers attested that current approaches, in which uncertainty originating from limited count data is often neglected, lead to inconsistencies and an underestimation of the total uncertainty in a model.
A key driver for innovation in the UK dairy sector is the ability to deal rapidly with zoonotic hazards due to negative publicity. Barker and Gomez-Tome (2013) modelled Staphylococcus aureus in milk in terms of pathogen concentrations, population growth and enterotoxin production, as well as effects of cooling and storage on growth, and alkaline phosphatase as an indicator of potential hazards. This BN also enabled food chain biotracing, indicating three potential causes of S. aureus contamination by propagating effects of particular end point observations to express posterior beliefs about possible causes.

4.2. Waterborne microbial risk assessments

Donald et al. (2009) developed a BN as a supplementary analysis to a QMRA, which described a conceptual model for health risks associated with recycled water, with a chosen health endpoint of gastroenteritis. The BN was useful in identifying the nodes with the most influence on the incidence of gastroenteritis. By calculating credible intervals the authors contributed a method for quantifying the uncertainty of point estimates arising from the BN, adding to the available tools for assessing microbial health risk as a result of environmental exposures.
Goulding et al. (2012) used a BN to increase understanding of the public health impacts of sewer overflows in wet weather, in order to prioritize management options. QMRA was used to identify the threats to the waterway values and the relationships between the variables for inclusion in the BN. The network model enabled the effectiveness of various sewer overflow management options in reducing the public health risk to be determined through the application of probabilistic inference, and the model was also able to account for the uncertainty inherent in such events and their subsequent impacts.
In a QMRA for waterborne pathogens in a freshwater lake, a Bayesian model was developed using concentrations of faecal indicator bacteria (FIB), frequency of pathogen detection and physicochemical parameters such as temperature and salinity to determine factors predictive of human health risk (Staley et al., 2012). The authors concluded that BN modelling of physical and bacterial parameters can be useful in predicting conditions under which low or high risk of pathogen presence exists, making the tool valuable in applications such as water quality monitoring at beach and shell fishing areas.
In a similar recreational water setting, the potential threat of faecal contamination was assessed using a BN to explore differences between analytical methods (most probable number (MPN) and colony-forming units (CFU)) for quantifying FIB concentrations, and between different sampling locations and times (Gronewold et al., 2011). The aim was to reduce uncertainty in water resource management decisions by fully understanding and accounting for methodological variability associated with FIB quantification methods, and to improve the estimation and representation of FIB inactivation rates. Comparison of a conventional model of bacterial inactivation rates with a novel Bayesian model revealed that the latter provided a more robust approach to quantifying uncertainty in microbiological assessments of water quality than the conventional MPN-based model and therefore reduced uncertainty in water resource management decisions.

5. Discussion

The synopsis above, along with the summary presented in Table A.1 in Appendix A, clearly demonstrates the diversity and utility of BNs in the exploration of microbial risk. In general, the research on foodborne pathogens aimed to solve highly specific tactical or operational types of decision problems (Sutherland, 1983) over short term time scales. In contrast, the research environments for waterborne pathogens were spatially larger, used aggregated state indicators and aimed to solve directive or strategic types of decision problems (Sutherland, 1983) over longer time scales.

5.1. QMRA focus

In all of the studies, BNs were used in the context of QMRA to achieve the aim of quantifying an aspect of the microbial hazard and making predictions, although minimising risk through scenario assessment and informing management options was described in only 10 articles. Seven of the 15 studies reported using the method for the separation of uncertainty and variability, and 8 mentioned reduction of uncertainty as a benefit of using a BN. Six of the 15 articles reported developing a new method or new software during the course of the research.
In the majority (11) of the 15 studies, the BN was used to investigate a fragment of the exposure pathway, such as the environmental influences on pathogen concentration, parameters of bacterial growth models, or pathogen enumeration issues such as recovery efficiencies. In the publications which did not describe a complete QMRA, it was often not clear whether the published material describing the BN application comprised the entire risk assessment. Studies by Donald et al. (2009) and Albert et al. (2008) incorporated the hazard identification, dose–response, exposure and risk characterisation steps of a QMRA. In one article (Goulding et al., 2012), a QMRA was used to provide inputs for the BN that the authors developed.

5.2. Foodborne risk assessments

Although most of the QMRA studies modelled specific modules of the food chain, and thereby a component of the QMRA process in detail, Albert et al. (2008) began with a simplistic model of the entire food chain including consumption, improving certain points gradually with new evidence, which was subsequently propagated throughout the BN to maintain the overall veracity of the model.
The applications describing a fragment of the foodborne pathogen exposure pathway were very detailed, comprising extensive analyses of high resolution data. For instance, 8 of the 11 risk assessments of foodborne pathogens focused on the dynamics of microbial populations affecting an endpoint of pathogen or spore concentration, using a BN to improve estimates of variables such as growth and resistance parameters, or to examine the variation of populations with time and temperature. Another innovative purpose identified for BNs to augment QMRA is biotracing, the identification of sources of bacterial contamination in a chain of events such as a food production line (Smid et al., 2011). Five of the 11 foodborne risk studies used a BN to achieve source-level inference, or biotracing.

5.3. Waterborne risk assessments

Of the 4 BNs used to assist in the description of waterborne pathogen risk, two chose pathogen presence/absence (Staley et al., 2012) or 'true' but unobserved FIB concentration (Gronewold et al., 2013) as an endpoint. In a similar manner to the foodborne risk assessments with pathogen concentration endpoints, Staley et al. (2012) used a BN to undertake source tracking in order to identify faecal contamination sources in a freshwater lake, and Gronewold et al. (2011) used a BN to compare E. coli and Enterococcus dark inactivation rates. The endpoint chosen by Donald et al. (2009) was gastroenteritis, whereas Goulding et al. (2012) expressed the risk to human health in terms of the threat to five waterway uses.

5.4. BN procedures

Of the 11 studies which stated their sources for the structure of the BN, 7 used a combination of sources, and 4 used a single source to inform the model structure. There was a notable absence of accounts of structure learning from data, apart from that reported by Staley et al. (2012). Equal numbers of studies (6) used empirical data or expert opinion to quantify conditional probability tables. Model validation was carried out principally using data or by sensitivity analysis, with 2 studies using existing models to validate their BN and 2 using expert evaluation. Validation procedures reported included conventional regression analysis, 10-fold cross validation and a ‘leave one out’ cross-confirmation procedure. In almost all cases, belief updating was achieved through the data, with 1 study also using expert opinion.
On the whole, explicit discussion of model validation, discretisation, belief updating and knowledge sources for conditional probability tables was uncommon, and obscured by a lack of uniformity in terminology and structured detail. There was a wide variation in implied meanings of commonly used terms such as ‘Bayesian approach’, ‘Bayesian methodology’, ‘data’, ‘parameters’ and ‘variables’, and such terms were rarely defined by authors. Furthermore, while the literature was clear on modelling and statistical approaches, it was less clear on the mechanics of empirical data being incorporated into conditional probability tables and on specific aspects of the application of the method in QMRA. The adoption of a standardized approach to the reporting of studies using Bayesian networks, as well as agreement on and use of a universal terminology would improve accessibility of this technique to multidisciplinary teams.
Based on the reviewed studies, primary advantages and drawbacks of the use of BNs for QMRA are now discussed.

5.5. Advantages of using Bayesian networks in QMRA

A BN is a natural framework for combining results of a QMRA with results from epidemiological studies. BNs facilitate the impartial, systematic combination of disparate information sources (Albert et al., 2008, Barker et al., 2002). Data can comprise point estimates, probability distributions, field observations, published results or expert opinion. Because it can incorporate such diverse data types, a BN enables a complex, multivariate statistical problem (such as QMRA) to be addressed efficiently where classical statistical methods are often inadequate (Albert et al., 2008).
A particularly important feature with respect to QMRA is that poor quality experimental data has little impact on a distribution function for pathogen concentration that is established from high quality prior information (Barker et al., 2002, Kuikka et al., 1999). Furthermore, accurate predictions can be made with incomplete data (Fenton and Neil, 2013), or quite small sample sizes (Kontkanen et al., 1997).
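The limited influence of poor quality data on a well-informed prior can be illustrated with a conjugate normal update. The following is a minimal sketch, not taken from any of the reviewed studies: the function name `normal_update` and all numerical values (a prior on log10 pathogen concentration and two hypothetical sets of measurements) are assumptions chosen only for illustration.

```python
def normal_update(prior_mean, prior_sd, obs_mean, obs_sd, n_obs):
    """Posterior mean and sd of a normal mean, given a normal prior and
    n_obs observations with known measurement sd (conjugate update)."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n_obs / obs_sd ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * obs_mean) / post_precision
    return post_mean, post_precision ** -0.5


# Hypothetical prior on log10 concentration: mean 2.0, sd 0.2 (high quality).
# Case 1: three noisy measurements (sd 2.0) averaging 4.0.
noisy_mean, noisy_sd = normal_update(2.0, 0.2, 4.0, 2.0, 3)

# Case 2: three precise measurements (sd 0.1) averaging 4.0.
precise_mean, precise_sd = normal_update(2.0, 0.2, 4.0, 0.1, 3)

print(f"noisy data:   posterior mean {noisy_mean:.3f}, sd {noisy_sd:.3f}")
print(f"precise data: posterior mean {precise_mean:.3f}, sd {precise_sd:.3f}")
```

Under these assumed values the noisy measurements barely move the posterior away from the prior, whereas precise measurements of the same average dominate it. The weighting is by precision (inverse variance), which is why low quality data carry little weight.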
As discussed previously, BNs by their nature allow ‘backwards reasoning’: when evidence is entered at an effect or outcome node, the resulting changes in the probabilities of the causal nodes can be observed. This feature enables determination of a diagnostic probability as opposed to a causal probability (Barker et al., 2009, Greiner et al., 2013). Standard MC approaches, by contrast, cannot use data sets downstream of other data sets (Albert et al., 2008). Moreover, new evidence can serve to update prior distributions, positively or negatively adjusting the initial beliefs of the expert (Greiner et al., 2013).
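The distinction between a causal and a diagnostic probability can be sketched with a two-node network in which a hypothetical ‘contamination source’ node is the parent of a ‘product tests positive’ node. The node states, the prior and the conditional probabilities below are invented for illustration and do not come from any reviewed study.

```python
# Hypothetical prior over the parent node (contamination source).
prior = {"none": 0.90, "farm": 0.07, "processing": 0.03}

# Hypothetical conditional probability table: P(positive test | source).
p_positive = {"none": 0.01, "farm": 0.60, "processing": 0.80}


def causal_probability(prior, likelihood):
    """Forward reasoning: P(effect), marginalised over the causes."""
    return sum(prior[s] * likelihood[s] for s in prior)


def diagnostic_probability(prior, likelihood):
    """Backward reasoning: P(cause | effect observed), by Bayes' rule."""
    evidence = causal_probability(prior, likelihood)
    return {s: prior[s] * likelihood[s] / evidence for s in prior}


p_pos = causal_probability(prior, p_positive)
posterior = diagnostic_probability(prior, p_positive)

print(f"P(positive) = {p_pos:.3f}")
for source, p in posterior.items():
    print(f"P({source} | positive) = {p:.2f}")
```

With these assumed numbers, observing a positive test shifts most of the probability mass from ‘none’ to the two contaminating sources. This is the same kind of inference that underlies biotracing, here reduced to its simplest form.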
A BN responds immediately to changes in the network such as entering new evidence, because it does not use simulation. Information may be propagated from any point in the network to all others by Bayesian inference (Parsons et al., 2005). The efficiency of a BN framework is also evident in the systematic representation of the joint probability in such a system, which significantly reduces complexity (Smid et al., 2010). Moreover, new evidence from multiple variables in the network can be used to update the estimates of the unobserved parameters in the model (Smid et al., 2010).
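One way to see how the factorised representation of the joint probability reduces complexity is to count free parameters. The sketch below relies on a general property of BNs rather than on a result from the cited studies, and the ten-node binary chain is a hypothetical example: a full joint table needs one parameter per joint state (minus one), whereas a BN needs only one conditional table per node.

```python
from math import prod


def full_joint_params(cardinalities):
    """Free parameters in an unfactorised joint probability table."""
    return prod(cardinalities.values()) - 1


def bn_params(cardinalities, parents):
    """Free parameters in a BN: for each node, (states - 1) per
    configuration of its parents."""
    total = 0
    for node, n_states in cardinalities.items():
        parent_configs = prod(cardinalities[p] for p in parents.get(node, []))
        total += (n_states - 1) * parent_configs
    return total


# Hypothetical chain X0 -> X1 -> ... -> X9 of binary nodes.
nodes = [f"X{i}" for i in range(10)]
cardinalities = {n: 2 for n in nodes}
parents = {nodes[i]: [nodes[i - 1]] for i in range(1, 10)}

print(full_joint_params(cardinalities))  # 1023
print(bn_params(cardinalities, parents))  # 19
```

The saving grows with the number of nodes, provided each node has few parents. Densely connected networks lose much of this advantage.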
Interventions to reduce risk can be simulated in the network by changing parameter values, enabling calculation of risk reduction. An adequately validated BN provides a clear understanding of the effect of interventions on the outcomes of interest by altering the prior distributions of variables to simulate a new risk-reduction strategy (Albert et al., 2008). Verifying the effect of control measures by simulation greatly improves the visibility and efficiency of decision making, with the potential for reducing costs (Liu et al., 2013).
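Simulating an intervention by altering a prior distribution can be sketched with a three-node network in which ‘contaminated’ and ‘cooking’ are parents of ‘illness’. All names and probabilities below are hypothetical; the sketch only shows the mechanics of recomputing the outcome probability after a prior is changed.

```python
from itertools import product

# Hypothetical CPT: P(illness | contaminated, cooking).
P_ILLNESS = {
    (True, "inadequate"): 0.2,
    (True, "adequate"): 0.01,
    (False, "inadequate"): 0.0,
    (False, "adequate"): 0.0,
}


def illness_risk(p_contaminated, p_inadequate):
    """Marginal P(illness), enumerating all parent configurations."""
    p_c = {True: p_contaminated, False: 1.0 - p_contaminated}
    p_k = {"inadequate": p_inadequate, "adequate": 1.0 - p_inadequate}
    return sum(p_c[c] * p_k[k] * P_ILLNESS[(c, k)]
               for c, k in product(p_c, p_k))


# Baseline: 20% of servings inadequately cooked (assumed).
baseline = illness_risk(p_contaminated=0.3, p_inadequate=0.2)

# Intervention: an education campaign assumed to cut this to 5%.
intervention = illness_risk(p_contaminated=0.3, p_inadequate=0.05)

relative_reduction = 1.0 - intervention / baseline
print(f"baseline {baseline:.4f}, intervention {intervention:.5f}, "
      f"relative reduction {relative_reduction:.1%}")
```

In a real application the conditional probability table would come from data, models or elicitation, and the calculation would be carried out by BN software rather than by explicit enumeration, which scales poorly beyond a handful of nodes.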
The visual representation in a BN of large quantities of complex information provides an informative platform for improved communication across disciplines between mathematical modellers, domain experts and stakeholders (Aguilera et al., 2011, Barker et al., 2002). Although their graphical structure is not a unique feature (Smid et al., 2010), environmental and biotic causal influences on the outcome of interest can be clearly represented and easily visualised. This feature is particularly relevant in QMRA methodology, where representation of a complex system is required in conjunction with transparent modelling of such process influences, as many of the leading-edge numerical tools employed in QMRA may not be accessible or transparent to non-technical team members (Smid et al., 2011).

5.6. Challenges of using Bayesian networks

Quantification of the conditional probability tables underlying each node in a BN can be challenging. Empirical data, from field or laboratory observations or from the relevant literature, require manipulation to determine conditional probabilities (Fenton and Neil, 2013). The alternative, using information elicited from experts to determine these conditional probabilities, may also present a significant challenge (Düspohl et al., 2012, Newton, 2009). The difficulties include representing these initial beliefs as a joint prior distribution over all the nodes in the BN, incorporating sufficient transparency and rigour in the elicitation process (Pollino and Hart, 2005), and overcoming the reluctance of scientists to work with beliefs and probabilities when they are more familiar with observational data and classical statistical methods (Düspohl et al., 2012, Uusitalo, 2007).
A second challenge is the representation of probability distributions, either as continuous or discretised distributions. The discretisation of continuous variables introduces errors in the marginal distributions which can accumulate, producing an inaccurate representation of the model distributions (Parsons et al., 2005, Smid et al., 2010). Smid et al. (2010) assert that the discretisation of continuous variables leads to less accurate models, particularly when the distribution tails are critical (as in the case of low pathogen concentrations), as these are not explored in detail. However, an important development in the advancement of BNs has been the introduction of hybrid models, in which continuous and discrete variables can coexist (Aguilera et al., 2011). For continuous distributions, another limitation in the Bayesian approach is that the assumed distributional form of the priors may not be appropriate (Mitchell-Blackwood et al., 2012). The problem of subjectivity in model and parameter selection has also been identified as a drawback by Soller (2008).
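The effect of discretisation on distribution tails can be illustrated numerically. The sketch below makes several assumptions that are not drawn from the reviewed studies: log10 concentration follows a standard normal distribution, the probability mass within each bin is treated as uniformly spread, and the bin edges and threshold are arbitrary.

```python
from statistics import NormalDist


def discretised_tail(dist, edges, threshold):
    """P(X > threshold) when each bin keeps its exact mass but that
    mass is assumed to be spread uniformly within the bin."""
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mass = dist.cdf(hi) - dist.cdf(lo)
        if threshold <= lo:
            fraction = 1.0          # whole bin lies above the threshold
        elif threshold >= hi:
            fraction = 0.0          # whole bin lies below the threshold
        else:
            fraction = (hi - threshold) / (hi - lo)
        total += mass * fraction
    return total


log_conc = NormalDist(mu=0.0, sigma=1.0)   # assumed distribution
threshold = 2.5                            # assumed tail of interest

exact = 1.0 - log_conc.cdf(threshold)
coarse = discretised_tail(log_conc, [-4, -1, 0, 1, 4], threshold)
fine = discretised_tail(log_conc, [-4 + 0.25 * i for i in range(33)], threshold)

print(f"exact {exact:.5f}, coarse bins {coarse:.5f}, fine bins {fine:.5f}")
```

Under these assumptions the four-bin discretisation overstates the tail probability by more than an order of magnitude, because the wide outer bin hides the rapid decay of the density. The fine grid recovers the tail closely, but at the cost of much larger conditional probability tables.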
The inability of BNs to support feedback loops due to their acyclic nature is a concern mentioned frequently in the literature (Jensen, 2001, McCann et al., 2006, Nyberg et al., 2006). This issue can be surmounted by the implementation of a dynamic BN (Johnson et al., 2010, Johnson and Mengersen, 2011, Smid et al., 2010), although Smid et al. (2011) caution that these models can become so large as to be computationally infeasible.
The choice of software used for the BN will determine whether certain procedures can be performed or not; for example, some software packages allow hybrid models, comprising continuous and discrete variables, and some do not. Likewise, certain types of model validation, e.g., cross-validation, can only be performed if the software accommodates them (Aguilera et al., 2011). Aguilera et al. (2011) also note that if the research is interdisciplinary, with both environmental and computer/mathematical roles, these software limitations may not pose a problem.

6. Conclusions

This appraisal of the use of BNs in assessing risk from microbial exposure provides a contextual background for the consideration of future frameworks and methods in this area. BNs provide a suite of attributes, including flexibility, the modular representation of a complex multivariate problem, the ability to integrate different forms of information and to account for uncertainty and variability, and the ability to make inferences using ‘downstream’ data. These features simplify scenario analysis during risk assessment and enable adaptive management, while the convenient graphical interface promotes ownership and communication among stakeholders and multidisciplinary research teams.
The drawbacks of BNs, including challenges with eliciting conditional probabilities and representing spatial and temporal variability, may depend on the complexity and scope of the risk question and can be overcome, at least in part, by using BNs as an adjunctive modelling tool. The published research on microbial risk assessments with BN applications would benefit from increased emphasis on procedural transparency, organizational rigour and the inclusion of supplementary material with the primary article. We consider that BNs have great potential for wider use in QMRA, for the protection of human health.

Role of the funding source

This work was supported by an Australian Postgraduate Award. The Australian federal government had no involvement in this project, or in the decision to submit the article for publication.

Appendix A.

Table A.1. Summary of applications of Bayesian networks in quantitative microbial risk assessment.

| Reference | Domain | Knowledge source informing model structure | Source of conditional probability table values | Number of nodes | Software | Model validation method | Belief updating | Quantifies hazard | Prediction | Separation of uncertainty and variability | Uncertainty reduction reported | Scenario assessment and decision making | Software developed | New method |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Barker et al. (2002) | QRA for food-borne pathogen: hazards arising from Clostridium botulinum growth and toxin production | Literature, expert knowledge | Empirical data, model simulations | 10 (spore concentrations), 23 (bacterial growth) | Hugin | Sensitivity analysis | Data | y | y | Not discussed | y | y | n | n |
| Pouillot et al. (2003) | QRA for food-borne pathogen: Listeria monocytogenes in milk | Literature, expert knowledge | Model simulations | 29 | WinBUGS | Based on existing models | Data, maximum likelihood estimates | y | y | y | y | Not discussed | n | y |
| Parsons et al. (2005) | QRA for a food-borne pathogen: Salmonella spp. in poultry meat production chain | Expert knowledge | Literature, expert opinion, other unpublished sources | 23 nodes in Fig. 1, but ‘the final model consisted of 60 parameters’ | 1. Netica; 2. WinBUGS | Output compared with survey data, sensitivity analysis | 1. Data; 2. Data | y | y | 1. n; 2. y | 1. y; 2. y | y | n | |
| Delignette-Muller et al. (2006) | QRA for a food-borne pathogen: exposure assessment for Listeria monocytogenes in cold-smoked salmon | Literature, observations, expert knowledge | Not discussed | L. monocytogenes model: 19 nodes including 7 hyperparameters and 2 covariates; food flora model: 18 nodes including 9 hyperparameters and 2 covariates | WinBUGS | Data | Not discussed | y | y | y | y | Not discussed | n | n |
| Albert et al. (2008) | QRA for food-borne pathogens: estimating the probability of campylobacteriosis caused by home consumption of chicken meat | Expert knowledge | Model simulations, expert opinion | 24 (augmented core model) | JAGS and WinBUGS | Data, expert evaluation, sensitivity analysis | Data | y | y | Not discussed | Not discussed | y | n | y |
| Smid et al. (2012); Smid et al. (2011) | QRA for food-borne pathogen: Salmonella in the pork slaughter chain | Literature, observations, expert knowledge | Empirical data, expert opinion, literature, model simulations | 63 | Hugin | Sensitivity analysis, based on existing models | Data | y | y | Not discussed | y | y | y | n |
| Rigaux et al. (2012a) | QRA for food-borne pathogen: Bacillus cereus in courgette puree | Not discussed | Not discussed | 58 | JAGS 2.1.0 | Data | MCMC algorithm | y | y | y | y | y | n | y |
| Rigaux et al. (2012b) | QRA for food-borne pathogen: Geobacillus stearothermophilus in the spoilage of canned foods | Literature (peer reviewed and unpublished data) | Not discussed | Basic model: 8; intermediate model: 10; complete model: 14 | JAGS 3.2.0 | 10-fold cross-validation | Model simulations using MCMC | y | y | y | n | Not discussed | n | n |
| Smid et al. (2013) | QRA for food-borne pathogen: Salmonella in pork production chain | Not discussed | Empirical data | 21 variables | Hugin | Data | Data (sequential adaptation) | y | y | n | y | Not discussed | n | n |
| Barker and Gomez-Tome (2013) | QRA for food-borne pathogens: enterotoxigenic Staphylococcus aureus in milk | Not discussed | Empirical data (published and observed), expert opinion | 35 parameters | Hugin | Checked model output with some published data | Not discussed | y | y | y | Not discussed | y | n | n |
| Donald et al. (2009) | Estimating potential health risks associated with recycled water | Expert knowledge | Expert opinion | 14 | Netica and Hugin (model 1); WinBUGS (model 2) | Sensitivity analysis | Expert opinion, data | y | y | n | n | y | n | y |
| Gronewold et al. (2011) | Assessment of the potential threat of faecal contamination in surface water | Literature, observations, model simulations | Not discussed | 17 | WinBUGS | Compared conventional regression analysis, leave one out; cross-confirmation procedure | Data | y | y | y | y | y | n | y |
| Goulding et al. (2012) | Environmental engineering/public health: assessment of public health risk from wet weather sewer overflows | Literature | Empirical data (published and observed), expert opinion, modelling | 14 | Not discussed | Expert evaluation, sensitivity analysis | Data | y | y | n | n | y | n | n |
| Staley et al. (2012) | QRA for waterborne pathogens in a freshwater lake | Expert knowledge, machine learning | Empirical data | 6 | Hugin | Not discussed | Data | y | y | n | Not discussed | y | n | n |

References
